Recommending Component by Citation: A Semi-supervised Approach for Determination
Abstract
Reusing existing components can help developers improve development productivity and reduce cost. In this scenario, reuse repositories act as a fundamental facility for acquiring the needed components. When retrieving components from reuse repositories, developers often face the problem of choosing among candidates that provide similar functionality. To address this problem, this paper proposes a semi-supervised method for recommending components in reuse repositories to developers. Unlike existing rating-based recommendation approaches, which often suffer from a lack of user ratings, our approach calculates the recommendation probabilities of components from their citations on the Internet. The citations are acquired through the websites (called hosts in this paper) that are associated with the components. A random walk algorithm explores the associations between components and hosts to identify recommendable components. We implemented our approach in a prototype system and used it to conduct an experimental study evaluating the approach. The experimental results demonstrate that our approach can accurately recommend components and thus has the potential to assist developers in reuse.
Keywords: software reuse; component recommendation; reuse repository
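The abstract does not spell out the random walk itself, so the following is only a minimal sketch of one common variant, a random walk with restart over a bipartite component-host graph. It assumes that citations arrive as (component, host) pairs and that a few components are already labeled as recommendable (the semi-supervised seeds). The function name, the input format, the restart value, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def random_walk_scores(citations, seeds, restart=0.15, iterations=50):
    """Score components with a random walk with restart (illustrative only).

    citations: iterable of (component, host) pairs (assumed input format).
    seeds: components already labeled as recommendable (assumed labels).
    """
    # Build an undirected bipartite graph. Node keys are tagged with their
    # kind so that a component and a host with the same name cannot collide.
    neighbors = defaultdict(set)
    for component, host in citations:
        c, h = ("c", component), ("h", host)
        neighbors[c].add(h)
        neighbors[h].add(c)

    seed_nodes = [("c", s) for s in seeds if ("c", s) in neighbors]
    if not seed_nodes:
        raise ValueError("at least one seed must appear in the citations")

    # The walk restarts uniformly at the seed components.
    restart_mass = {node: 1.0 / len(seed_nodes) for node in seed_nodes}
    scores = dict(restart_mass)

    for _ in range(iterations):
        updated = defaultdict(float)
        for node, mass in scores.items():
            # Spread (1 - restart) of this node's mass evenly to its neighbors.
            share = (1.0 - restart) * mass / len(neighbors[node])
            for other in neighbors[node]:
                updated[other] += share
        for node, mass in restart_mass.items():
            updated[node] += restart * mass
        scores = dict(updated)

    # Report component nodes only, highest score first.
    ranked = [(name, s) for (kind, name), s in scores.items() if kind == "c"]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: libA is a known-good seed that shares a host with libB.
    sample = [("libA", "site1"), ("libB", "site1"), ("libB", "site2"),
              ("libC", "site3")]
    for name, score in random_walk_scores(sample, seeds=["libA"]):
        print(f"{name}: {score:.3f}")
```

In this sketch a component that shares hosts with a seed accumulates score, while a component with no path to any seed (libC above) receives none and is left out of the ranking. Whether the paper uses restarts, edge weights, or a different propagation rule cannot be determined from the abstract alone.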
Similar resources
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
On Propagated Scoring for Semi-supervised Additive Models
In this paper, a semi-supervised modeling framework that combines feature-based (x) data and graph-based (G) data for classification/regression of the response Y is presented. In this semi-supervised setting, Y is observed for a subset of the observations (labeled) and missing for the remainder (unlabeled). The Propagated Scoring algorithm proposed for fitting this model is a semi-supervised fi...
Semi-Supervised Classification with Graph Convolutional Networks
We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden lay...
Graph Convolutional Networks
We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden lay...
Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Although many studies have been conducted to improve the clustering efficiency, most of the state-of-art schemes suffer from the lack of robustness and stability. This paper is aimed at proposing an efficient approach to elicit prior knowledge in terms of must-link and cannot-link from the estimated distribution of raw data in order to convert a blind clustering problem into a semi-supervised o...
Publication date: 2011